A PDF is not enough: Crowdsourcing the T1 mapping common ground via the ISMRM reproducibility challenge

*Mathieu Boudreau1,2, *Agah Karakuzu1, Julien Cohen-Adad1,3,4,5, Ecem Bozkurt6, Madeline Carr7,8, Marco Castellaro9, Luis Concha10, Mariya Doneva11, Seraina Dual12, Alex Ensworth13,14, Alexandru Foias1, Véronique Fortier15,16, Refaat E. Gabr17, Guillaume Gilbert18, Carri K. Glide-Hurst19, Matthew Grech-Sollars20,21, Siyuan Hu22, Oscar Jalnefjord23,24, Jorge Jovicich25, Kübra Keskin6, Peter Koken11, Anastasia Kolokotronis13,26, Simran Kukran27,28, Nam. G. Lee6, Ives R. Levesque13,29, Bochao Li6, Dan Ma22, Burkhard Mädler30, Nyasha Maforo31,32, Jamie Near33,34, Erick Pasaye10, Alonso Ramirez-Manzanares35, Ben Statton36, Christian Stehning30, Stefano Tambalo25, Ye Tian6, Chenyang Wang37, Kilian Weiss30, Niloufar Zakariaei38, Shuo Zhang30, Ziwei Zhao6, Nikola Stikov1,2,39

  • *Authors MB and AK contributed equally to this work

1NeuroPoly Lab, Polytechnique Montréal, Montreal, Quebec, Canada, 2Montreal Heart Institute, Montreal, Quebec, Canada, 3Unité de Neuroimagerie Fonctionnelle (UNF), Centre de recherche de l’Institut Universitaire de Gériatrie de Montréal (CRIUGM), Montreal, Quebec, Canada, 4Mila - Quebec AI Institute, Montreal, QC, Canada, 5Centre de recherche du CHU Sainte-Justine, Université de Montréal, Montreal, QC, Canada, 6Magnetic Resonance Engineering Laboratory (MREL), University of Southern California, Los Angeles, California, USA, 7Medical Physics, Ingham Institute for Applied Medical Research, Liverpool, Australia, 8Department of Medical Physics, Liverpool and Macarthur Cancer Therapy Centres, Liverpool, Australia, 9Department of Information Engineering, University of Padova, Padova, Italy, 10Institute of Neurobiology, Universidad Nacional Autónoma de México Campus Juriquilla, Querétaro, México, 11Philips Research Hamburg, Germany, 12Stanford University, Stanford, California, United States, 13Medical Physics Unit, McGill University, Montreal, Canada, 14University of British Columbia, Vancouver, Canada, 15Department of Medical Imaging, McGill University Health Centre, Montreal, Quebec, Canada 16Department of Radiology, McGill University, Montreal, Quebec, Canada, 17Department of Diagnostic and Interventional Imaging, University of Texas Health Science Center at Houston, McGovern Medical School, Houston, Texas, USA, 18MR Clinical Science, Philips Canada, Mississauga, Ontario, Canada, 19Department of Human Oncology, University of Wisconsin-Madison, Madison, Wisconsin, USA, 20Centre for Medical Image Computing, Department of Computer Science, University College London, London, UK, 21Lysholm Department of Neuroradiology, National Hospital for Neurology and Neurosurgery, University College London Hospitals NHS Foundation Trust, London, UK, 22Department of Biomedical Engineering, Case Western Reserve University, Cleveland, Ohio, USA, 23Department of Medical Radiation Sciences, 
Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden, 24Biomedical Engineering, Sahlgrenska University Hospital, Gothenburg, Sweden, 25Center for Mind/Brain Sciences, University of Trento, Italy, 26Hopital Maisonneuve-Rosemont, Montreal, Canada, 27Bioengineering, Imperial College London, UK, 28Radiotherapy and Imaging, Institute of Cancer Research, Imperial College London, UK, 29Research Institute of the McGill University Health Centre, Montreal, Canada, 30Clinical Science, Philips Healthcare, Germany, 31Department of Radiological Sciences, University of California Los Angeles, Los Angeles, CA, USA, 32Physics and Biology in Medicine IDP, University of California Los Angeles, Los Angeles, CA, USA, 33Douglas Brain Imaging Centre, Montreal, Canada, 34Sunnybrook Research Institute, Toronto, Canada, 35Computer Science Department, Centro de Investigación en Matemáticas, A.C., Guanajuato, México, 36Medical Research Council, London Institute of Medical Sciences, Imperial College London, London, United Kingdom, 37Department of Radiation Oncology - CNS Service, The University of Texas MD Anderson Cancer Center, Texas, USA, 38Department of Biomedical Engineering, University of British Columbia, British Columbia, Canada, 39Center for Advanced Interdisciplinary Research, Ss. Cyril and Methodius University, Skopje, North Macedonia

Abstract#

Purpose: T1 mapping is a widely used quantitative MRI technique, but its tissue-specific values remain inconsistent across protocols, sites, and vendors. The ISMRM Reproducible Research study group (RRSG) and Quantitative MR study group (qMRSG) jointly launched a T1 mapping reproducibility challenge to assess the reproducibility of a well-established inversion recovery T1 mapping technique, published solely as a PDF, on a standardized phantom and in healthy human brains.

Methods: The challenge used the acquisition protocol and fitting algorithm from Barral et al. 2010. Participants collected T1 mapping data on the ISMRM/NIST phantom and/or in healthy human brains. Data submission, pipeline development, and analysis were conducted using open-source platforms. Inter-submission and intra-submission comparisons were performed using one dataset per submission.

Results: Eighteen submissions were accepted, with data collected on MRI systems from three manufacturers, primarily at 3T (with one submission at 0.35T). The study collected 39 phantom and 56 human datasets. The mean coefficient of variation (CoV) was 6.1% for inter-submission phantom measurements, nearly twice the intra-submission CoV (2.9%). For human data, the inter-/intra-submission CoVs were 5.9%/3.2% in the genu of the corpus callosum and 16%/6.9% in cortical gray matter. To facilitate broader community access and engagement, an interactive dashboard was developed and is available at https://rrsg2020.dashboards.neurolibre.org.

Conclusion: The inter-submission variability was twice as high as the intra-submission variability in both phantom and human brain T1 measurements, indicating that the published PDF was not sufficient to reproduce a quantitative MRI protocol.

Dashboard: Challenge Submissions

1     |     INTRODUCTION#

Significant challenges exist in the reproducibility of quantitative MRI (qMRI) [1]. Despite its promise of improving the specificity and reproducibility of MRI acquisitions, few qMRI techniques have been integrated into clinical practice. Even the most fundamental MR parameters cannot be measured with sufficient reproducibility and precision across clinical scanners to pass the second of six stages of technical assessment for clinical biomarkers [2–4]. Half a century has passed since quantitative T1 (spin-lattice relaxation time) measurements were first reported as a potential biomarker for tumors [5], followed shortly thereafter by the first in vivo T1 maps [6] of tumors, but reported values for this fundamental parameter still disagree across different sites, vendors, and implementations [7].

Among fundamental MRI parameters, T1 holds significant importance [8]. T1 represents the time constant for recovery of equilibrium longitudinal magnetization, and its value varies with molecular mobility and magnetic field strength [9–11]. Knowledge of tissue T1 values is crucial for optimizing clinical MRI sequences for contrast and time efficiency [12–14] and for calibrating other quantitative MRI techniques [15,16]. Inversion recovery (IR) [17,18] is considered the gold standard for T1 measurement due to its robustness, but its long acquisition times limit the clinical use of IR for T1 mapping [7]. In practice, IR is often used as a reference for validating other T1 mapping techniques, such as variable flip angle imaging (VFA) [19–21], Look-Locker [22–24], and MP2RAGE [25,26].
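As a concrete illustration of the time constant, a minimal sketch (illustrative values only, not drawn from the cited studies) shows that roughly 63% of the equilibrium longitudinal magnetization has recovered after one T1 period:

```python
import numpy as np

def longitudinal_recovery(t, T1, M0=1.0):
    """Longitudinal magnetization at time t after perfect saturation."""
    return M0 * (1.0 - np.exp(-t / T1))

# With a hypothetical white-matter-like T1 of 850 ms at 3T,
# recovery after one T1 period is 1 - 1/e, i.e. about 63%.
print(round(longitudinal_recovery(t=850.0, T1=850.0), 3))  # 0.632
```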

In ongoing efforts to standardize T1 mapping methods, researchers have been actively developing quantitative MRI phantoms [27]. The International Society for Magnetic Resonance in Medicine (ISMRM) and the National Institute of Standards and Technology (NIST) collaborated on a standard system phantom [28], which was subsequently commercialized (Premium System Phantom, CaliberMRI, Boulder, Colorado). This phantom has since been used in large multicenter studies, such as Bane et al. [29] which concluded that acquisition protocol and field strength influence accuracy, repeatability, and interplatform reproducibility. Another NIST-led study [30] found no significant T1 discrepancies among measurements using NIST protocols across 27 MRI systems from three vendors at two clinical field strengths.

The 2020 ISMRM reproducibility challenge posed the following question: can an imaging protocol, independently implemented at multiple centers, consistently measure one of the fundamental MRI parameters (T1)? To assess this, we proposed using inversion recovery on a standardized phantom (ISMRM/NIST system phantom) and the healthy human brain. Specifically, this challenge explored whether the information provided in a published PDF of a seminal paper on T1 mapping [31] is sufficient to ensure reproducibility across independent research imaging groups.

2     |     METHODS#

2.1     |     Phantom and human data#

The challenge asked participants with access to the ISMRM/NIST system phantom [28] (Premium System Phantom, CaliberMRI, Boulder, Colorado) to measure T1 maps of the phantom’s T1 plate (Table 1). Researchers who participated in the challenge were instructed to record the temperature before and after scanning the phantom using the phantom’s internal thermometer. Instructions for positioning and setting up the phantom were devised by NIST and provided to participants through the NIST website. In brief, the instructions explained how to orient the phantom and how long the phantom should be in the scanner room prior to scanning to achieve thermal equilibrium.

Table 1 Reference T1 values of the NiCl2 array of the standard system phantom (for both phantom versions) measured at 20 °C and 3T. Phantoms with serial numbers 0042 or less are referred to as “Version 1”, and those 0043 or greater are “Version 2”.#

| Sphere | Version 1 (ms) | Version 2 (ms) |
|--------|----------------|----------------|
| 1 | 1989 ± 1.0 | 1883.97 ± 30.32 |
| 2 | 1454 ± 2.5 | 1330.16 ± 20.41 |
| 3 | 984.1 ± 0.33 | 987.27 ± 14.22 |
| 4 | 706 ± 1.0 | 690.08 ± 10.12 |
| 5 | 496.7 ± 0.41 | 484.97 ± 7.06 |
| 6 | 351.5 ± 0.91 | 341.58 ± 4.97 |
| 7 | 247.13 ± 0.086 | 240.86 ± 3.51 |
| 8 | 175.3 ± 0.11 | 174.95 ± 2.48 |
| 9 | 125.9 ± 0.33 | 121.08 ± 1.75 |
| 10 | 89.0 ± 0.17 | 85.75 ± 1.24 |
| 11 | 62.7 ± 0.13 | 60.21 ± 0.87 |
| 12 | 44.53 ± 0.090 | 42.89 ± 0.44 |
| 13 | 30.84 ± 0.016 | 30.40 ± 0.62 |
| 14 | 21.719 ± 0.005 | 21.44 ± 0.31 |

Participants were also instructed to collect T1 maps in healthy human brains, and were asked to measure a single slice positioned parallel to the anterior commissure - posterior commissure (AC-PC) line. Prior to imaging, the participants consented to share their de-identified data with the challenge organizers and on the Open Science Framework (OSF.io) website. As the submitted data was a single slice, the researchers were not instructed to de-face the data of their participants. Researchers submitting human data provided written confirmation to the organizers that their data was acquired in accordance with their institutional ethics committee (or equivalent regulatory body) and that the subjects had consented to data sharing as outlined in the challenge.

2.2     |     MRI acquisition protocol#

Participants followed the inversion recovery T1 mapping protocol optimized for the human brain as described in the published PDF [31], which consisted of: TR = 2550 ms, TIs = 50, 400, 1100, 2500 ms, TE = 14 ms, 2 mm slice thickness, and 1×1 mm² in-plane resolution. Note that this protocol is not suitable for fitting models that assume TR > 5T1. Instead, the more general Barral et al. [31] fitting model described in Section 2.4.1 can be used, which is compatible with both magnitude-only and complex data. Researchers were instructed to adhere closely to this protocol and report any deviations due to technical limitations.
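The long-TR condition can be checked numerically. A minimal sketch (the T1 values below are representative assumptions, not challenge results) shows that the protocol's TR of 2550 ms falls well short of 5×T1 for long-T1 species, so the recovered fraction of longitudinal magnetization is far from complete:

```python
import numpy as np

# Sketch: verify that the challenge TR (2550 ms) does not satisfy the
# long-TR approximation (TR > 5*T1) for longer T1 species.
# The T1 values below are representative assumptions, not challenge results.
TR = 2550.0  # ms
for T1 in (500.0, 850.0, 1500.0, 2000.0):
    recovered = 1.0 - np.exp(-TR / T1)
    print(f"T1 = {T1:6.0f} ms | TR/T1 = {TR / T1:4.2f} | recovered fraction = {recovered:.3f}")
```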

2.3     |     Data Submissions#

Data submissions for the challenge were handled through a GitHub repository (https://github.com/rrsg2020/data_submission), enabling a standardized and transparent process. All datasets were converted to the NIfTI format, and images for all TIs were concatenated along the fourth dimension. Each submission included a YAML file storing additional information (submitter details, acquisition details, and phantom or human subject details). Submissions were reviewed, and following acceptance the datasets were uploaded to OSF.io (osf.io/ywc9g/). A Jupyter Notebook [32,33] pipeline using qMRLab [34,35] was used to process the T1 maps and to conduct quality-control checks. MyBinder links to Jupyter notebooks that reproduced each T1 map were shared in each submission's GitHub issue, making the results easy to reproduce in a web browser while maintaining consistent computational environments. Eighteen submissions were included in the analysis, resulting in 39 T1 maps of the NIST/system phantom and 56 brain T1 maps. Figure 1 illustrates all the submissions that acquired phantom data (Figure 1-a) and human data (Figure 1-b), the MRI scanner vendors, and the resulting T1 mapping datasets. Some submissions included measurements where both complex and magnitude-only data from the same acquisition were used to fit T1 maps, so the total number of unique acquisitions is lower than the numbers reported above (27 for phantom data and 44 for human data). The datasets were collected on systems from three MRI manufacturers (Siemens, GE, Philips) and were acquired at 3T, except for one dataset acquired at 0.35T (the ViewRay MRIdian MR-linac).
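The submission layout (images for all TIs concatenated along the fourth dimension of a NIfTI volume) can be sketched with NumPy; in practice a library such as nibabel would wrap the resulting array in a NIfTI image before upload. Matrix size and pixel values here are hypothetical:

```python
import numpy as np

# Sketch of the submission layout (hypothetical matrix size and data):
# single-slice images for each TI stacked along the fourth dimension,
# as required for the challenge NIfTI submissions.
nx, ny = 192, 192                    # assumed in-plane matrix
TIs = [50, 400, 1100, 2500]          # inversion times from the protocol (ms)
ti_images = [np.random.rand(nx, ny) for _ in TIs]

# Shape (nx, ny, 1, n_TI): one slice, TIs along the fourth dimension
volume = np.stack(ti_images, axis=-1)[:, :, np.newaxis, :]
print(volume.shape)  # (192, 192, 1, 4)
```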

_images/figure_1.png

2.4     |     Fitting Model and Pipeline#

A reduced-dimension non-linear least squares (RD-NLS) approach was used to fit the complex general inversion recovery signal equation:

(1)#\[S(TI) = a + be^{-TI/T_1}\]

where a and b are complex constants. This approach, developed by Barral et al. [31], models the general T1 signal equation without relying on the long-TR approximation. The constants a and b inherently factor in TR, as well as other imaging parameters such as the excitation and inversion pulse flip angles, TE, etc. Barral et al. [31] shared the MATLAB (MathWorks, Natick, MA) code for the fitting algorithm used in their paper. Magnitude-only data were fitted to a modified version of Eq. 1 (Eq. 15 of Barral et al. 2010) with signal-polarity restoration: the signal minimum was found, the inversion recovery curve was fitted for two cases (data points for TI < TImin flipped, and data points for TI ≤ TImin flipped), and the case that resulted in the best fit was selected. This code is available as part of the open-source software qMRLab [34,35], which provides a standardized application programming interface (API) to call the fitting from MATLAB/Octave scripts.
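A simplified, real-valued sketch of the magnitude-only fitting strategy is shown below. It is not the RD-NLS implementation from Barral et al.; it uses an ordinary non-linear least-squares fit (scipy) on synthetic data, but follows the same polarity-restoration logic of trying both sign patterns around the signal minimum and keeping the better fit:

```python
import numpy as np
from scipy.optimize import curve_fit

def ir_model(TI, a, b, T1):
    """Real-valued form of Eq. 1: S(TI) = a + b * exp(-TI / T1)."""
    return a + b * np.exp(-TI / T1)

def fit_magnitude_ir(TI, S):
    """Fit magnitude IR data with polarity restoration (sketch).

    Flips the sign of the points before (then up to and including)
    the signal minimum, fits both cases, and keeps the better fit.
    """
    i_min = int(np.argmin(S))
    best = None
    for n_flip in (i_min, i_min + 1):  # TI < TI_min vs TI <= TI_min flipped
        signed = S.copy()
        signed[:n_flip] *= -1.0
        p0 = (signed.max(), signed.min() - signed.max(), 800.0)
        try:
            popt, _ = curve_fit(ir_model, TI, signed, p0=p0, maxfev=5000)
        except RuntimeError:
            continue
        resid = np.sum((ir_model(TI, *popt) - signed) ** 2)
        if best is None or resid < best[0]:
            best = (resid, popt)
    return best[1]  # (a, b, T1)

# Synthetic example at the challenge's four TIs (hypothetical parameters)
TI = np.array([50.0, 400.0, 1100.0, 2500.0])
S = np.abs(ir_model(TI, a=1.0, b=-1.9, T1=850.0))
a, b, T1_fit = fit_magnitude_ir(TI, S)
print(round(T1_fit, 1))
```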

A data processing pipeline was written using MATLAB/Octave in a Jupyter Notebook. This pipeline downloads every dataset from OSF.io (osf.io/ywc9g/), loads each dataset's configuration file, fits the T1 maps, and saves them to NIfTI and PNG formats. The code is available on GitHub (https://github.com/rrsg2020/t1_fitting_pipeline, filename: RRSG_T1_fitting.ipynb). Finally, T1 maps were manually uploaded to OSF (https://osf.io/ywc9g/).

2.5     |     Image Labeling & Registration#

The T1 plate (NiCl2 array) of the phantom has 14 spheres that were labeled as regions of interest (ROIs) using a numerical mask template created in MATLAB, provided by NIST researchers (Figure 1-c). To avoid potential edge effects in the T1 maps, the ROI labels were reduced to 60% of the expected sphere diameter. A registration pipeline in Python using the Advanced Normalization Tools (ANTs) [36] was developed and shared in the analysis repository of our GitHub organization (https://github.com/rrsg2020/analysis, filename: register_t1maps_nist.py, commit ID: 8d38644). Briefly, a label-based registration was first applied to obtain a coarse alignment, followed by an affine registration (gradientStep: 0.1, metric: cross correlation, number of steps: 3, iterations: 100/100/100, smoothness: 0/0/0, sub-sampling: 4/2/1) and a BSplineSyN registration (gradientStep: 0.5, meshSizeAtBaseLevel: 3, number of steps: 3, iterations: 50/50/10, smoothness: 0/0/0, sub-sampling: 4/2/1). The ROI label template was nonlinearly registered to each T1 map uploaded to OSF.
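The 60% label shrinkage can be illustrated with a minimal NumPy sketch (a circular 2D ROI, not the actual NIST template). Note that scaling the diameter by 0.6 reduces the ROI area to about 36% of the original:

```python
import numpy as np

def circular_roi(shape, center, radius, fraction=1.0):
    """Circular ROI mask at `fraction` of the nominal diameter.

    A 2D sketch of shrinking the phantom sphere labels to avoid
    edge effects; this is not the actual NIST mask template.
    """
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    r2 = (yy - center[0]) ** 2 + (xx - center[1]) ** 2
    return r2 <= (radius * fraction) ** 2

full = circular_roi((64, 64), center=(32, 32), radius=10)
shrunk = circular_roi((64, 64), center=(32, 32), radius=10, fraction=0.6)

# Scaling the diameter by 0.6 scales the ROI area by ~0.36
print(full.sum(), shrunk.sum())
```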

For human data, manual ROIs were segmented by a single researcher (M.B., 11+ years of neuroimaging experience) using FSLeyes [37] in four regions (Figure 1-d): the genu, the splenium, deep gray matter, and cortical gray matter. Automatic segmentation was not used because the data were single-slice and slice positioning was inconsistent between datasets.

2.6     |     Analysis and Statistics#

Analysis code and scripts were developed and shared in a version-controlled public GitHub repository. The T1 fitting and data analysis were performed by M.B., one of the challenge organizers. Computational environment requirements were containerized in Docker [38,39] to create an executable environment that allows the analysis to be reproduced in a web browser via MyBinder [40]. Backend Python files handled reference data, database operations, ROI masking, and general analysis tools. Configuration files handled dataset information, and the datasets were downloaded and pooled using a script (make_pooled_datasets.py). The databases were created using a reproducible Jupyter Notebook script and subsequently saved in the repository.

The mean T1 values of the ISMRM/NIST phantom data for each ROI were compared with temperature-corrected reference values and visualized in three different types of plots (linear axes, log-log axes, and error relative to the reference value). Temperature correction involved nonlinear interpolation of a NIST reference table of T1 values for temperatures ranging from 16 °C to 26 °C (2 °C intervals) as specified in the phantom’s technical specifications. For the human datasets, the mean and standard deviations for each tissue ROI were calculated from all submissions across all sites. All quality assurance and analysis plot images were stored in the repository. Additionally, the database files of ROI values and acquisition details for all submissions were also stored in the repository.
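The temperature-correction step can be sketched as interpolation over a per-sphere reference table. The table values and scan temperature below are hypothetical (not the NIST specification), and a cubic spline is used here as one possible nonlinear interpolant:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical reference table for one sphere: T1 (ms) at 16-26 degC in
# 2 degC steps. These values are illustrative, NOT the NIST specification.
temps = np.array([16.0, 18.0, 20.0, 22.0, 24.0, 26.0])
t1_ref = np.array([1820.0, 1852.0, 1884.0, 1917.0, 1950.0, 1984.0])

# A cubic spline stands in for the nonlinear interpolant
spline = CubicSpline(temps, t1_ref)

scan_temp = 21.3  # e.g., mean of the pre-/post-scan phantom temperatures
t1_corrected = float(spline(scan_temp))
print(round(t1_corrected, 1))
```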

3     |     RESULTS#

3.1     |     Dashboard#

To disseminate the challenge results, a web-based dashboard was developed (Figure 2, https://rrsg2020.dashboards.neurolibre.org). The landing page (Figure 2-a) showcases the relationship between the phantom and brain datasets acquired at different sites/vendors. Navigating to the phantom section displays information about the submitted datasets, such as the mean/std/median/CoV for each sphere, the % difference from the reference values, the number of scans, and the temperature (Figure 2-b, left). Other options allow users to limit the results to specific versions of the phantom or specific MRI manufacturers. Selecting either the “By Sphere” (Figure 2-b, right) or “By Site” tab displays whisker plots for the selected options, enabling further exploration of the datasets.

Returning to the home page and selecting the brain section allows exploration of information on the brain datasets (Figure 2-c, left), such as mean T1 and STD for different ROI regions, as well as selection of specific MRI manufacturers. Choosing the “By Regions” tab provides whisker plots of the datasets for the selected ROI (Figure 2-c, right), similar to the plots for the phantom.

Figure 2 Dashboard. a) Welcome page listing all the sites, subject types, and scanners, and the relationships between them. Row b) shows two of the phantom dashboard tabs, and row c) shows two of the human data dashboard tabs. Link: https://rrsg2020.dashboards.neurolibre.org

3.2     |     Submissions#

Eighteen submissions were included in the analysis, which resulted in 39 T1 maps of the NIST/system phantom and 56 brain T1 maps. Figure 3 illustrates all the submissions that acquired phantom data (Figure 3-a) and human data (Figure 3-b), the number of scanners used for each submission, and the number of T1 mapping datasets. It should be noted that these numbers include a subset of measurements where both complex and magnitude-only data from the same acquisition were used to fit T1 maps, so the total number of unique acquisitions is lower than the numbers reported above (27 for phantom data and 44 for human data). The datasets were collected on systems from three MRI manufacturers (Siemens, GE, Philips) and were acquired at 3T, except for one dataset acquired at 0.35T (the ViewRay MRIdian MR-linac). To showcase the heterogeneity of the actual T1 map data from the independently implemented submissions, Figure 4 displays six T1 maps of the phantoms submitted to the challenge.

images/figure_3_full.png

Figure 3 Complete list of the datasets submitted to the challenge. Submissions that included phantom data are shown in a), and those that included human brain data are shown in b). Submissions were assigned numbers to keep track of which submissions included both phantom and human data. Some submissions included datasets acquired on multiple scanners. For the phantom (panel a), each submission acquired all of its data using a single phantom; however, some researchers shared the same physical phantom with each other (same color). Some additional details about the datasets are included in the T1 maps column, where relevant. Note that for complex datasets in the magnitude/phase format, T1 maps were calculated using both magnitude-only and complex data, but these were from the same measurement (branching-off arrow).

Several submissions went beyond the minimum acquisition and acquired additional datasets using the ISMRM/NIST phantom, such as a traveling phantom (7 scanners/sites), scan-rescan, same-day rescans on two MRIs, short vs. long TR, and 4-point vs. 14-point TI sampling. For human data, one site acquired 13 subjects on two scanners (different manufacturers), one site acquired 6 subjects, and one site acquired a subject using two different head coils (20 channels vs. 64 channels).

from os import path
from pathlib import Path
import os
from repo2data.repo2data import Repo2Data

# `build` is assumed to be defined earlier by the book build system
# ('latest' rebuilds from the live repository, 'archive' uses archived data).
if build == 'latest':
    if not path.isdir('analysis'):
        !git clone https://github.com/rrsg2020/analysis.git
        dir_name = 'analysis'
        analysis = os.listdir(dir_name)

        # Remove notebooks and markdown files from the cloned repository,
        # keeping only the modules needed for this analysis.
        for item in analysis:
            if item.endswith((".ipynb", ".md")):
                os.remove(os.path.join(dir_name, item))
elif build == 'archive':
    if os.path.isdir(Path('../../data')):
        data_path = ['../../data/rrsg-2020-neurolibre']
    else:
        # Define the data requirement path and download the archived data
        data_req_path = os.path.join("..", "binder", "data_requirement.json")
        repo2data = Repo2Data(data_req_path)
        data_path = repo2data.install()

# Imports

from pathlib import Path
import pandas as pd
import json
import nibabel as nib
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from IPython.display import Video
import glob
from analysis.src.plots import *
from analysis.make_pooled_datasets import *

# Configurations
if build == 'latest':
    configFile = Path('analysis/configs/3T_NIST_T1maps.json')
    data_folder_name = 'analysis/3T_NIST_T1maps'

    configFile_raw = Path('analysis/configs/3T_NIST.json')
    data_folder_name_raw = 'analysis/3T_NIST'
elif build=='archive':
    configFile = Path(data_path[0] + '/analysis/configs/3T_NIST_T1maps.json')
    data_folder_name = data_path[0] + '/analysis/3T_NIST_T1maps'

    configFile_raw = Path(data_path[0] + '/analysis/configs/3T_NIST.json')
    data_folder_name_raw = data_path[0] + '/analysis/3T_NIST'

# Download datasets
if not Path(data_folder_name).exists():
    print(Path(data_folder_name))
    make_pooled_dataset(configFile, data_folder_name)

if not Path(data_folder_name_raw).exists():
    make_pooled_dataset(configFile_raw, data_folder_name_raw)

with open(configFile) as json_file:
    configJson = json.load(json_file)

with open(configFile_raw) as json_file:
    configJson_raw = json.load(json_file)
    
def get_image(dataset_name, key2):
    # Load T1 image data
    t1_file = configJson[dataset_name]['datasets'][key2]['imagePath']
    t1 = nib.load(Path(data_folder_name) / t1_file)
    t1_volume = t1.get_fdata() 

    # Load raw image data
    raw_file = configJson_raw[dataset_name]['datasets'][key2]['imagePath']
    raw = nib.load(Path(data_folder_name_raw) / raw_file)
    raw_volume = raw.get_fdata() 

    # Handle 2D vs 3D volume case
    dims = t1_volume.shape
    if (len(dims) == 2) or (np.min(dims) == 1):
        im = np.rot90(t1_volume)
        TI_1 = np.rot90(np.squeeze(raw_volume[:,:,0,0]))
        TI_2 = np.rot90(np.squeeze(raw_volume[:,:,0,1]))
        TI_3 = np.rot90(np.squeeze(raw_volume[:,:,0,2]))
        TI_4 = np.rot90(np.squeeze(raw_volume[:,:,0,3]))    
    else:
        index_smallest_dim = np.argmin(dims)
        numberOfSlices = dims[index_smallest_dim]
        midSlice = int(np.round(numberOfSlices/2))

        if index_smallest_dim == 0:
            im = np.rot90(np.squeeze(t1_volume[midSlice,:,:]))
            TI_1 = np.rot90(np.squeeze(raw_volume[midSlice,:,:,0]))
            TI_2 = np.rot90(np.squeeze(raw_volume[midSlice,:,:,1]))
            TI_3 = np.rot90(np.squeeze(raw_volume[midSlice,:,:,2]))
            TI_4 = np.rot90(np.squeeze(raw_volume[midSlice,:,:,3]))    
        elif index_smallest_dim == 1:
            im = np.rot90(np.squeeze(t1_volume[:,midSlice,:]))
            TI_1 = np.rot90(np.squeeze(raw_volume[:,midSlice,:,0]))
            TI_2 = np.rot90(np.squeeze(raw_volume[:,midSlice,:,1]))
            TI_3 = np.rot90(np.squeeze(raw_volume[:,midSlice,:,2]))
            TI_4 = np.rot90(np.squeeze(raw_volume[:,midSlice,:,3]))
        elif index_smallest_dim == 2:
            im = np.rot90(np.squeeze(t1_volume[:,:,midSlice]))
            TI_1 = np.rot90(np.squeeze(raw_volume[:,:,midSlice,0]))
            TI_2 = np.rot90(np.squeeze(raw_volume[:,:,midSlice,1]))
            TI_3 = np.rot90(np.squeeze(raw_volume[:,:,midSlice,2]))
            TI_4 = np.rot90(np.squeeze(raw_volume[:,:,midSlice,3]))

    xAxis = np.linspace(0,im.shape[0]-1, num=im.shape[0])
    yAxis = np.linspace(0,im.shape[1]-1, num=im.shape[1])
    return im, xAxis, yAxis, TI_1, TI_2, TI_3, TI_4


im_1, xAxis_1, yAxis_1, TI_1_1, TI_2_1, TI_3_1, TI_4_1 = get_image('wang_MDAnderson_NIST', 'day2_mag')

im_2, xAxis_2, yAxis_2, TI_1_2, TI_2_2, TI_3_2, TI_4_2 = get_image('CStehningPhilipsClinicalScienceGermany_NIST', 'Bonn_MR1_magnitude')

im_3, xAxis_3, yAxis_3, TI_1_3, TI_2_3, TI_3_3, TI_4_3 = get_image('mrel_usc_NIST', 'Session1_MR1')

im_4, xAxis_4, yAxis_4, TI_1_4, TI_2_4, TI_3_4, TI_4_4 = get_image('karakuzu_polymtl_NIST', 'mni')

im_5, xAxis_5, yAxis_5, TI_1_5, TI_2_5, TI_3_5, TI_4_5 = get_image('madelinecarr_lha_NIST', 'one')

im_6, xAxis_6, yAxis_6, TI_1_6, TI_2_6, TI_3_6, TI_4_6 = get_image('matthewgrechsollars_ICL_NIST', 'magnitude')
im_6 = np.flipud(im_6)
TI_1_6 = np.flipud(TI_1_6)
TI_2_6 = np.flipud(TI_2_6)
TI_3_6 = np.flipud(TI_3_6)
TI_4_6 = np.flipud(TI_4_6)
# PYTHON CODE
# Module imports
import matplotlib.pyplot as plt
from PIL import Image
from matplotlib.image import imread
import scipy.io
import plotly.graph_objs as go
import numpy as np
from plotly import __version__
from plotly.offline import init_notebook_mode, iplot, plot
config={'showLink': False, 'displayModeBar': False, 'responsive': True}

init_notebook_mode(connected=True)

from IPython.display import display, HTML

import os
import markdown
import random
from scipy.integrate import quad

import warnings
warnings.filterwarnings('ignore')

xAxis = np.linspace(0,256*3-1, num=256*3)
yAxis = np.linspace(0,256*2-1, num=256*2)

# T1 maps
im_2_padded = np.pad(im_2,32)
images_1 = np.concatenate((im_1, im_5, im_3), axis=1)
images_2 = np.concatenate((im_4, im_2_padded, im_6), axis=1)
images = np.concatenate((images_2, images_1), axis=0)

# TI_1 maps
TI_1_2_padded = np.pad(TI_1_2,32)
TI_1_images_1 = np.concatenate((TI_1_1, TI_1_5, TI_1_3), axis=1)
TI_1_images_2 = np.concatenate((TI_1_4, TI_1_2_padded, TI_1_6), axis=1)
TI_1_images = np.concatenate((TI_1_images_2, TI_1_images_1), axis=0)

# TI_2 maps
TI_2_2_padded = np.pad(TI_2_2,32)
TI_2_images_1 = np.concatenate((TI_2_1, TI_2_5, TI_2_3), axis=1)
TI_2_images_2 = np.concatenate((TI_2_4, TI_2_2_padded, TI_2_6), axis=1)
TI_2_images = np.concatenate((TI_2_images_2, TI_2_images_1), axis=0)

# TI_3 maps
TI_3_2_padded = np.pad(TI_3_2,32)
TI_3_images_1 = np.concatenate((TI_3_1, TI_3_5, TI_3_3), axis=1)
TI_3_images_2 = np.concatenate((TI_3_4, TI_3_2_padded, TI_3_6), axis=1)
TI_3_images = np.concatenate((TI_3_images_2, TI_3_images_1), axis=0)

# TI_4 maps
TI_4_2_padded = np.pad(TI_4_2,32)
TI_4_images_1 = np.concatenate((TI_4_1, TI_4_5, TI_4_3), axis=1)
TI_4_images_2 = np.concatenate((TI_4_4, TI_4_2_padded, TI_4_6), axis=1)
TI_4_images = np.concatenate((TI_4_images_2, TI_4_images_1), axis=0)

trace1 = go.Heatmap(x = xAxis,
                   y = yAxis_1,
                   z=images,
                   zmin=0,
                   zmax=3000,
                   colorscale='viridis',
                   colorbar={"title": 'T<sub>1</sub> (ms)',
                             'titlefont': dict(
                                   family='Times New Roman',
                                   size=26,
                                   )
                            },
                   xaxis='x2',
                   yaxis='y2',
                   visible=True)

trace2 = go.Heatmap(x = xAxis,
                   y = yAxis_1,
                   z=TI_1_images,
                   zmin=0,
                   zmax=3000,
                   colorscale='gray',
                   colorbar={"title": 'T<sub>1</sub> (ms)',
                             'titlefont': dict(
                                   family='Times New Roman',
                                   size=26,
                                   color='white'
                                   )
                            },
                   xaxis='x2',
                   yaxis='y2',
                   visible=False)

trace3 = go.Heatmap(x = xAxis,
                   y = yAxis_1,
                   z=TI_2_images,
                   zmin=0,
                   zmax=3000,
                   colorscale='gray',
                   colorbar={"title": 'T<sub>1</sub> (ms)',
                             'titlefont': dict(
                                   family='Times New Roman',
                                   size=26,
                                   color='white'
                                   )
                            },
                   xaxis='x2',
                   yaxis='y2',
                   visible=False)

trace4 = go.Heatmap(x = xAxis,
                   y = yAxis_1,
                   z=TI_3_images,
                   zmin=0,
                   zmax=3000,
                   colorscale='gray',
                   colorbar={"title": 'T<sub>1</sub> (ms)',
                             'titlefont': dict(
                                   family='Times New Roman',
                                   size=26,
                                   color='white'
                                   )
                            },
                   xaxis='x2',
                   yaxis='y2',
                   visible=False)

trace5 = go.Heatmap(x = xAxis,
                   y = yAxis_1,
                   z=TI_4_images,
                   zmin=0,
                   zmax=3000,
                   colorscale='gray',
                   colorbar={"title": 'T<sub>1</sub> (ms)',
                             'titlefont': dict(
                                   family='Times New Roman',
                                   size=26,
                                   color='white'
                                   )
                            },
                   xaxis='x2',
                   yaxis='y2',
                   visible=False)

data=[trace1, trace2, trace3, trace4, trace5]

updatemenus = list([
    dict(active=0,
         x = 0.4,
         xanchor = 'left',
         y = -0.15,
         yanchor = 'bottom',
         direction = 'up',
         font=dict(
                family='Times New Roman',
                size=16
            ),
         buttons=list([   
            dict(label = 'T<sub>1</sub> maps',
                 method = 'update',
                 args = [{'visible': [True, False, False, False, False],
                          'showscale': True,},
                         ]),
            dict(label = 'TI = 50 ms',
                 method = 'update',
                 args = [
                            {
                            'visible': [False, True, False, False, False],
                            'showscale': True,},
                           ]),
            dict(label = 'TI = 400 ms',
                 method = 'update',
                 args = [{'visible': [False, False, True, False, False],
                            'showscale': True,},
                           ]),
            dict(label = 'TI = 1100 ms',
                 method = 'update',
                 args = [{'visible': [False, False, False, True, False],
                            'showscale': True,},
                           ]),
            dict(label = 'TI ~ 2500 ms',
                 method = 'update',
                 args = [{'visible': [False, False, False, False, True],
                            'showscale': True,},
                           ]),
    ])
    )
])

layout = dict(
    width=960,
    height=649,
    margin = dict(
                t=40,
                r=50,
                b=10,
                l=50),
    xaxis = dict(range = [0,256*3-1], autorange = False,
             showgrid = False, zeroline = False, showticklabels = False,
             ticks = '', domain=[0, 1]),
    yaxis = dict(range = [0,256*2-1], autorange = False,
             showgrid = False, zeroline = False, showticklabels = False,
             ticks = '', domain=[0, 1]),
    xaxis2 = dict(range = [0,256*3-1], autorange = False,
             showgrid = False, zeroline = False, showticklabels = False,
             ticks = '', domain=[0, 1]),
    yaxis2 = dict(range = [0,256*2-1], autorange = False,
             showgrid = False, zeroline = False, showticklabels = False,
             ticks = '', domain=[0, 1], anchor='x2'),
    showlegend = False,
    autosize = False,
    updatemenus=updatemenus
)


fig = dict(data=data, layout=layout)

#iplot(fig, filename = 'basic-heatmap', config = config)
plot(fig, filename = 'figure2.html', config = config)
display(HTML('figure2.html'))

Figure 4 Example T1 maps generated from submitted data. Note the differences in acquisition (e.g. FOV (top middle), orientation (bottom right), gradient distortion correction (top left and right)) and the resulting artifacts in the T1 maps (e.g. ghosting (bottom left), ringing (bottom middle), noise profiles (top left and bottom right), deformation/slice mispositioning (top right)) arising from the independently implemented acquisition protocols.

3.2     |     Phantom

An overview of the T1 results for the submitted ISMRM/NIST phantom datasets is displayed in Figure 5. The same data are presented in each column with different axis types (linear, log, and error) to better visualize the results. Figure 5-a shows good agreement (slope = 0.98, intercept = -14 ms) for this dataset in comparison with the reference T1 values. However, this trend did not persist at low T1 values (T1 below 100-200 ms), as seen in the log-log plot (Figure 5-b), which was expected because the imaging protocol is optimized for human water-based T1 values (T1 above 500 ms). Errors exceeding 10 % are observed for phantom spheres with T1 values below this threshold (Figure 5-c). These trends are observed in the entire-dataset plots as well (Figure 5 d-f). More variability is seen in Figure 5-d around the identity line at very high T1 (T1 ~ 2000 ms) than towards the WM-GM range (T1 ~ 600-1400 ms), which is less apparent in the log-log plot (Figure 5-e). In addition to the low T1 values exceeding the 10 % error threshold (Figure 5-f), a few outlier measurements (~3-4) exceeding 10 % error were observed in the human water-based tissue range (~500-2000 ms).
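The 10 % threshold used throughout Figure 5 is a simple signed relative error. A minimal sketch of this computation (assuming the `calc_error` helper in the analysis repository implements a standard percent error; the function name and values below are illustrative):

```python
import numpy as np

def percent_error(reference, measured):
    """Signed percent error of measured T1 relative to the reference value."""
    reference = np.asarray(reference, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return 100.0 * (measured - reference) / reference

# A 550 ms estimate of a 500 ms sphere sits exactly on the +10 % dashed line
errors = percent_error([500.0, 2000.0], [550.0, 1900.0])
```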

from os import path
import os

if build == 'latest':
    if not path.isdir('analysis'):
        !git clone https://github.com/rrsg2020/analysis.git
        dir_name = 'analysis'
        analysis = os.listdir(dir_name)

        for item in analysis:
            if item.endswith(".ipynb"):
                os.remove(os.path.join(dir_name, item))
            if item.endswith(".md"):
                os.remove(os.path.join(dir_name, item))
elif build == 'archive':
    if os.path.isdir(Path('../../data')):
        data_path = ['../../data/rrsg-2020-neurolibre']
    else:
        # define data requirement path
        data_req_path = os.path.join("..", "binder", "data_requirement.json")
        # download data
        repo2data = Repo2Data(data_req_path)
        data_path = repo2data.install()

# Imports
import warnings
warnings.filterwarnings("ignore")

from pathlib import Path
import pandas as pd
import json
import nibabel as nib
import numpy as np

from analysis.src.database import *
from analysis.src.nist import get_reference_NIST_values, get_NIST_ids
from analysis.src.tools import calc_error
from analysis.src.nist import temperature_correction

import matplotlib.pyplot as plt
plt.style.use('analysis/custom_matplotlibrc')
plt.rcParams["figure.figsize"] = (10,10)
fig_id = 0

if build == 'latest':
    database_path = Path('analysis/databases/3T_NIST_T1maps_database.pkl')
    output_folder = Path("analysis/plots/03_singledataset_scatter_NIST-temperature-corrected/")
elif build=='archive':
    database_path = Path(data_path[0] + '/analysis/databases/3T_NIST_T1maps_database.pkl')
    output_folder = Path(data_path[0] + '/analysis/plots/03_singledataset_scatter_NIST-temperature-corrected/')

estimate_type = 'mean' # median or mean

## Define Functions
def plot_single_scatter(x, y, y_std,
                        title, x_label, y_label,
                        file_prefix, folder_path, fig_id,
                        y_type='linear'):
    # String comparison with '==' (the original 'is' checks only identity)
    if y_type == 'linear':
        plt.errorbar(x, y, y_std, fmt='o', solid_capstyle='projecting')
        ax = plt.gca()
        ax.axline((1, 1), slope=1, linestyle='dashed')
        ax.set_ylim(ymin=0, ymax=2500)
        ax.set_xlim(xmin=0, xmax=2500)
    elif y_type == 'log':
        plt.loglog(x, y, 'o')
        ax = plt.gca()
        ax.set_ylim(ymin=20, ymax=2500)
        ax.set_xlim(xmin=20, xmax=2500)
    elif y_type == 'error_t1':
        plt.errorbar(x, calc_error(x, y), fmt='o')
        ax = plt.gca()
        ax.axline((1, 0), slope=0, color='k')
        ax.axline((1, -10), slope=0, linestyle='dashed', color='k')
        ax.axline((1, 10), slope=0, linestyle='dashed', color='k')
        ax.set_ylim(ymin=-100, ymax=100)
        ax.set_xlim(xmin=0, xmax=2500)


    
    plt.title(title)
    plt.xlabel(x_label)
    plt.ylabel(y_label)

    fig = plt.gcf()
    

    folder_path.mkdir(parents=True, exist_ok=True)

    if fig_id<10:
        filename = "0" + str(fig_id) + "_" + file_prefix
    else:
        filename = str(fig_id) + "_" + file_prefix

    fig.savefig(folder_path / (str(filename) + '.svg'), facecolor='white')
    fig.savefig(folder_path / (str(filename) + '.png'), facecolor='white')
    fig_id = fig_id + 1
    plt.show()
    return fig_id

## Load database

df = pd.read_pickle(database_path)

## Initialize array

dataset_estimate = np.array([])
dataset_std = np.array([])

index = 6.001

serial_number = df.loc[index]['phantom serial number']


for key in get_NIST_ids():
    if estimate_type == 'mean':
        dataset_estimate = np.append(dataset_estimate, np.mean(df.loc[index][key]))
    elif estimate_type == 'median':
        dataset_estimate = np.append(dataset_estimate, np.median(df.loc[index][key]))
    else:
        raise ValueError('Unsupported dataset estimate type.')

    dataset_std = np.append(dataset_std, np.std(df.loc[index][key]))

ref_values = get_reference_NIST_values(serial_number)

temperature = df.loc[index]['phantom temperature']
temp_corrected_ref_values = temperature_correction(temperature, serial_number)


# PYTHON CODE
# Module imports

import matplotlib.pyplot as plt
from PIL import Image
from matplotlib.image import imread
import scipy.io
import plotly.graph_objs as go
import numpy as np
from plotly import __version__
from plotly.offline import init_notebook_mode, iplot, plot
config={'showLink': False, 'displayModeBar': False}

init_notebook_mode(connected=True)

from IPython.display import display, HTML

import os
import markdown
import random
from scipy.integrate import quad

import warnings
warnings.filterwarnings('ignore')

lin_line = go.Scatter(
    x=np.linspace(0,2500),
    y=np.linspace(0,2500),
    name="slope",
    line_shape='linear',
    line={'dash': 'dash','color': 'black'},
    visible = True,
    showlegend = False
)

data_lin=go.Scatter(
    x=temp_corrected_ref_values,
    y=dataset_estimate,
    error_y=dict(
        type='data', # value of error bar given in data coordinates
        array=dataset_std,
        visible=True),
    name = 'id: '+ str(index),
    mode = 'markers',
    marker=dict(color='#007ea7'),
    visible = True,
    showlegend = False,
    )

data_log=go.Scatter(
    x=temp_corrected_ref_values,
    y=dataset_estimate,
    name = 'id: '+ str(index),
    mode = 'markers',
    marker=dict(color='#007ea7'),
    visible = True,
    showlegend = False
    )
data_error=go.Scatter(
    x=temp_corrected_ref_values,
    y= calc_error(temp_corrected_ref_values,dataset_estimate),
    name = 'id: '+ str(index),
    mode = 'markers',
    marker=dict(color='#007ea7'),
    visible = True,
    showlegend = False
    )
err_solid_line = go.Scatter(
    x=np.linspace(0,2500),
    y=np.linspace(0,2500)*0,
    name="0 % error line",
    line_shape='linear',
    line={'dash': 'solid','color': 'black'},
    visible = True,
    showlegend = False
)
err_p10_line = go.Scatter(
    x=np.linspace(0,2500),
    y=np.linspace(0,2500)*0+10,
    name="+10% error",
    line_shape='linear',
    line={'dash': 'dash','color': 'black'},
    visible = True,
    showlegend = False
)
err_m10_line = go.Scatter(
    x=np.linspace(0,2500),
    y=np.linspace(0,2500)*0-10,
    name="-10% error",
    line_shape='linear',
    line={'dash': 'dash','color': 'black'},
    visible = True,
    showlegend = False
)

data = [lin_line, data_lin, data_log, data_error, err_solid_line, err_p10_line, err_m10_line]

from plotly.subplots import make_subplots
import plotly.graph_objects as go

fig = make_subplots(rows=2, cols=3, horizontal_spacing = 0.08)

fig.add_trace(
    lin_line,
    row=1, col=1
)

fig.add_trace(
    data_lin,
    row=1, col=1
)

fig.add_trace(
    data_log,
    row=1, col=2
)

fig.add_trace(
    data_error,
    row=1, col=3
)

fig.add_trace(
    err_solid_line,
    row=1, col=3
)

fig.add_trace(
    err_p10_line,
    row=1, col=3
)

fig.add_trace(
    err_m10_line,
    row=1, col=3
)

output_folder = Path("analysis/plots/04_alldatasets_scatter_NIST-temperature-corrected/")

## Initialize array

dataset_mean = np.zeros((1,14))
dataset_std = np.zeros((1,14))
version = np.array([])
temperature = np.array([])
ref_values = np.zeros((1,14))


ii=0
for index, row in df.iterrows():
    if type(df.loc[index]['T1 - NIST sphere 1']) is np.ndarray:

        version = np.append(version,df.loc[index]['phantom serial number'])
        temperature = np.append(temperature, df.loc[index]['phantom temperature'])


        if version[ii] is None:
            version[ii] = 999 # Missing version, only known case is one where we have version > 42 right now.
        
        if temperature[ii] is None:
            temperature[ii] = 20 # Missing temperature, assume it to be 20C (reference temperature).
            
            
        if ii==0:
            ref_values = get_reference_NIST_values(version[ii])
            temp_corrected_ref_values = temperature_correction(temperature[ii], version[ii])
        else:
            ref_values = np.vstack((ref_values, get_reference_NIST_values(version[ii])))
            temp_corrected_ref_values = np.vstack((temp_corrected_ref_values, temperature_correction(temperature[ii], version[ii])))
        
        tmp_dataset_estimate = np.array([])
        tmp_dataset_std = np.array([])

        for key in get_NIST_ids():
            if estimate_type == 'mean':
                tmp_dataset_estimate = np.append(tmp_dataset_estimate, np.mean(df.loc[index][key]))
            elif estimate_type == 'median':
                tmp_dataset_estimate = np.append(tmp_dataset_estimate, np.median(df.loc[index][key]))
            else:
                raise ValueError('Unsupported dataset estimate type.')

            tmp_dataset_std = np.append(tmp_dataset_std, np.std(df.loc[index][key]))

        if ii==0:
            dataset_estimate = tmp_dataset_estimate  
            dataset_std = tmp_dataset_std
        else:
            dataset_estimate = np.vstack((dataset_estimate, tmp_dataset_estimate))
            dataset_std = np.vstack((dataset_std, tmp_dataset_std))

        ii=ii+1

## Setup for plots
fig_id = 0
dims=ref_values.shape
file_prefix = 'alldatasets'

fig.add_trace(
    lin_line,
    row=2, col=1
)

for ii in range(dims[0]):
    data_lin=go.Scatter(
        x=temp_corrected_ref_values[ii,:],
        y=dataset_estimate[ii,:],
        error_y=dict(
            type='data', # value of error bar given in data coordinates
            array=dataset_std[ii,:],
            visible=True),
        name = 'id: '+ str(df.index[ii]),
        mode = 'markers',
        visible = True,
        showlegend = False,
        )

    fig.add_trace(
        data_lin,
        row=2, col=1
    )


for ii in range(dims[0]):
    data_log=go.Scatter(
        x=temp_corrected_ref_values[ii,:],
        y=dataset_estimate[ii,:],
        name = 'id: '+ str(df.index[ii]),
        mode = 'markers',
        visible = True,
        showlegend = False
        )

    fig.add_trace(
        data_log,
        row=2, col=2
    )

for ii in range(dims[0]):
    data_error=go.Scatter(
        x=temp_corrected_ref_values[ii,:],
        y= calc_error(temp_corrected_ref_values[ii,:],dataset_estimate[ii,:]),
        name = 'id: '+ str(df.index[ii]),
        mode = 'markers',
        visible = True,
        showlegend = False
        )

    fig.add_trace(
        data_error,
        row=2, col=3
    )

fig.add_trace(
    err_solid_line,
    row=2, col=3
)

fig.add_trace(
    err_p10_line,
    row=2, col=3
)

fig.add_trace(
    err_m10_line,
    row=2, col=3
)

fig.update_xaxes(
    type="linear",
    range=[0,2500],
    title='Reference T<sub>1</sub> (ms)',
    showgrid=True,
    gridcolor='rgb(169,169,169)',
    tick0 = 0,
    dtick = 500,
    linecolor='black',
    linewidth=2,
    row=1, col=1
    )
fig.update_yaxes(
    type="linear",
    range=[0,2500],
    title={
        'text':'T<sub>1</sub> estimate (ms)',
        'standoff':0
        },
    showgrid=True,
    tick0 = 0,
    dtick = 500,
    gridcolor='rgb(169,169,169)',
    linecolor='black',
    linewidth=2,
    row=1, col=1
    )

fig.update_xaxes(
    type="log",
    range=[np.log10(20),np.log10(2500)],
    title='Reference T<sub>1</sub> (ms)',
    showgrid=True,
    gridcolor='rgb(169,169,169)',
    minor=dict(ticks="inside", ticklen=6, showgrid=True),
    linecolor='black',
    linewidth=2,
    row=1, col=2
    )
fig.update_yaxes(
    type="log",
    range=[np.log10(20),np.log10(2500)],
    title={
        'text':'T<sub>1</sub> estimate (ms)',
        'standoff':0
        },
    showgrid=True,
    gridcolor='rgb(169,169,169)',
    minor=dict(ticks="inside", ticklen=6, showgrid=True),
    linecolor='black',
    linewidth=2,
    row=1, col=2)

fig.update_xaxes(
    type="linear",
    range=[0,2500],
    title='Reference T<sub>1</sub> (ms)',
    showgrid=True,
    gridcolor='rgb(169,169,169)',
    tick0 = 0,
    dtick = 500,
    linecolor='black',
    linewidth=2,
    row=1, col=3)

fig.update_yaxes(
    type="linear",
    range=[-100,100],
    title={
        'text':'Error (%)',
        'standoff':0
        },
    showgrid=True,
    gridcolor='rgb(169,169,169)',
    tick0 = -100,
    dtick = 25,
    linecolor='black',
    linewidth=2,
    row=1, col=3
    )


fig.update_xaxes(
    type="linear",
    range=[0,2500],
    title='Reference T<sub>1</sub> (ms)',
    showgrid=True,
    gridcolor='rgb(169,169,169)',
    tick0 = 0,
    dtick = 500,
    linecolor='black',
    linewidth=2,
    row=2, col=1
    )
fig.update_yaxes(
    type="linear",
    range=[0,2500],
    title={
        'text':'T<sub>1</sub> estimate (ms)',
        'standoff':0
        },
    showgrid=True,
    gridcolor='rgb(169,169,169)',
    tick0 = 0,
    dtick = 500,
    linecolor='black',
    linewidth=2,
    row=2, col=1
    )

fig.update_xaxes(
    type="log",
    range=[np.log10(20),np.log10(2500)],
    title='Reference T<sub>1</sub> (ms)',
    showgrid=True,
    gridcolor='rgb(169,169,169)',
    minor=dict(ticks="inside", ticklen=6, showgrid=True),
    linecolor='black',
    linewidth=2,
    row=2, col=2
    )
fig.update_yaxes(
    type="log",
    range=[np.log10(20),np.log10(2500)],
    title={
        'text':'T<sub>1</sub> estimate (ms)',
        'standoff':0
        },
    showgrid=True,
    gridcolor='rgb(169,169,169)',
    minor=dict(ticks="inside", ticklen=6, showgrid=True),
    linecolor='black',
    linewidth=2,
    row=2, col=2)

fig.update_xaxes(
    type="linear",
    range=[0,2500],
    title='Reference T<sub>1</sub> (ms)',
    showgrid=True,
    gridcolor='rgb(169,169,169)',
    tick0 = 0,
    dtick = 500,
    linecolor='black',
    linewidth=2,
    row=2, col=3)

fig.update_yaxes(
    type="linear",
    range=[-100,100],
    title={
        'text':'Error (%)',
        'standoff':0
        },
    showgrid=True,
    gridcolor='rgb(169,169,169)',
    tick0 = -100,
    dtick = 25,
    linecolor='black',
    linewidth=2,
    row=2, col=3,
    )

fig.update_layout(
    margin=dict(l=30, r=30, t=10, b=30),
    paper_bgcolor='rgb(255, 255, 255)',
    plot_bgcolor='rgb(255, 255, 255)',
    legend_title="",
    annotations=[
        dict(
            x=-0.05,
            y=0.53,
            showarrow=False,
            text='<b>a</b>',
            font=dict(
                family='Times New Roman',
                size=48
            ),
            xref='paper',
            yref='paper'
        ),
        dict(
            x=0.32,
            y=0.53,
            showarrow=False,
            text='<b>b</b>',
            font=dict(
                family='Times New Roman',
                size=48
            ),
            xref='paper',
            yref='paper'
        ),
        dict(
            x=0.7,
            y=0.53,
            showarrow=False,
            text='<b>c</b>',
            font=dict(
                family='Times New Roman',
                size=48
            ),
            xref='paper',
            yref='paper'
        ),
        dict(
            x=-0.05,
            y=-0.11,
            showarrow=False,
            text='<b>d</b>',
            font=dict(
                family='Times New Roman',
                size=48
            ),
            xref='paper',
            yref='paper'
        ),
        dict(
            x=0.32,
            y=-0.11,
            showarrow=False,
            text='<b>e</b>',
            font=dict(
                family='Times New Roman',
                size=48
            ),
            xref='paper',
            yref='paper'
        ),
        dict(
            x=0.7,
            y=-0.11,
            showarrow=False,
            text='<b>f</b>',
            font=dict(
                family='Times New Roman',
                size=48
            ),
            xref='paper',
            yref='paper'
        ),
        ]
    )

fig.update_layout(height=600, width=960)

#iplot(fig, filename = 'figure3', config = config)
plot(fig, filename = 'figure3.html', config = config)
display(HTML('figure3.html'))

Figure 5 Measured mean T1 values vs. temperature-corrected NIST reference values of the phantom spheres, presented as linear plots (a,d), log-log plots (b,e), and plots of the error relative to the reference T1 values (c,f). Plots (a–c) are of an example single dataset (T1 map dataset 6-1-1 in Figure 3-a), whereas plots (d–f) are of all acquired datasets. The dashed lines in plots (c,f) represent ±10 % error.

Inter-submission coefficients of variation (CoV) were calculated by selecting ad hoc one T1 map per challenge participant 2 and computing the CoV of the mean T1 values per sphere. The average inter-submission CoV across the first five spheres, which cover the expected range of T1 values in the human brain (~500 to ~2000 ms), was 6.1 % (sphere 1 = 4.7 %, sphere 2 = 3.1 %, sphere 3 = 6.3 %, sphere 4 = 12.8 %, sphere 5 = 7.3 %). Two sites were clear outliers with particular issues for sphere 4, likely because the signal null for this T1 value (~600 ms) lies near the second TI, and these fits incorrectly flipped the signal polarity at the near-null TI during magnitude-only fitting 3; removing these outliers reduces the mean inter-submission CoV to 4.1 % (sphere 1 = 5.4 %, sphere 2 = 3.5 %, sphere 3 = 2.5 %, sphere 4 = 4.2 %, sphere 5 = 4.9 %). One participant measured T1 maps with one phantom using one implemented protocol (identical imaging parameters) at seven different 3T sites on systems from a single manufacturer; the mean intra-submission CoV across the first five spheres for this case was 2.9 % (sphere 1 = 4.9 %, sphere 2 = 3.5 %, sphere 3 = 2.6 %, sphere 4 = 2.0 %, sphere 5 = 1.6 %).
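The per-sphere CoV values above follow the usual definition: the standard deviation of the per-submission mean T1 values divided by their mean. A minimal sketch (the submission values below are hypothetical, for illustration only):

```python
import numpy as np

def coefficient_of_variation(values):
    """CoV (%) across submissions: standard deviation over the mean."""
    values = np.asarray(values, dtype=float)
    return 100.0 * np.std(values) / np.mean(values)

# Hypothetical per-submission mean T1 values (ms) for one phantom sphere
sphere_means = [1900.0, 2000.0, 2100.0]
cov = coefficient_of_variation(sphere_means)  # ~4.1 %
```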

Figure 6 Scatter plot comparing complex and magnitude-only fitted data across the first five spheres representing the expected T1 value range in the human brain (~500 to ~2000 ms). The markers are color-coded based on the implementation site, while their size represents the percent difference calculated between the two fitting methods (magnitude or complex) for that datapoint.

Figure 6 compares the mean T1 values measured using complex and magnitude-only data for the 11 datasets where authors provided both in their submissions. Note that these datasets are from the same acquisition, not two separate acquisitions. The scatter plot shows that for the range of T1 values expected in the brain (T1 > 500 ms), there is almost no difference in fitted T1 values between the two types of data (the largest outlier shows a difference of ~9 ms). However, for T1 values below ~250 ms, large errors 4 were present, likely because the imaging protocol (specifically, the TI range) is sub-optimal for this range of T1 values.
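The polarity issue near the signal null described above can be illustrated with an idealized inversion recovery signal (a simplified mono-exponential model assuming perfect inversion and long TR; the actual pipeline fits the more general Barral et al. model):

```python
import numpy as np

def ir_signal(TI, T1):
    """Idealized inversion recovery signal: S(TI) = 1 - 2*exp(-TI/T1)."""
    return 1.0 - 2.0 * np.exp(-np.asarray(TI, dtype=float) / T1)

T1 = 600.0                                     # ms, close to phantom sphere 4
TIs = np.array([50.0, 400.0, 1100.0, 2500.0])  # ms, as in the challenge protocol

signed = ir_signal(TIs, T1)   # complex data preserve signal polarity
magnitude = np.abs(signed)    # magnitude data lose it

# The null occurs at TI = T1 * ln(2) ~ 416 ms, right next to the second TI,
# so an incorrect polarity assignment there barely changes the fit residual.
```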

Figure 7 Hierarchical analysis of T1 estimation error across voxel-wise distributions in spheres 1-6, using 20 measurements split into 9 quantiles (q1-q9). Each panel shows individual shift functions for each measurement (colored by vendor) in the top row, which characterize the percent measurement error as either overestimation or underestimation. The bottom row in each panel (black markers) displays the average trend of bootstrapped differences at each decile in milliseconds. The trends highlight any notable common patterns at the respective decile, such as a 39 ms median underestimation trend in Sphere 3. Straight lines in the top row indicate a homogeneous measurement error across voxels. High-density intervals not intersecting the zero crossing indicate a significant trend at the respective decile.

The direction of the measurement error in the phantom is influenced by both the measurement site and the reference value, as indicated by the individual shift functions (Figure 7). For example, for sphere 1 (~2000 ms), nearly half of the measurements (20 shown in total) are positioned on each side of the zero-crossing. On the other hand, for sphere 3 (~1000 ms), nearly all the measurements show underestimation, as the shift functions are located below the zero-crossing. Bootstrapped differences capture these trends, indicating a dominant overestimation at sphere 1 (median difference of +17 ms) and underestimation at sphere 3 (median difference of -39 ms). The high-density intervals associated with these median differences do not indicate a common pattern for the former (intervals cross zero), whereas they reveal a notable underestimation trend at sphere 3 (intervals do not include zero). A similar common pattern is also observed for sphere 2 (median overestimation of 35 ms). In addition, the shape of the individual shift functions conveys information about how the voxel-wise distributions differ. For example, some outliers can be observed in these graphs (two for sphere 4). Inspection of the T1 maps showed that these were not due to mispositioning of ROIs, but to the acquired T1 mapping data not fitting well for all spheres relative to the reference values, which produced the worst agreement for sphere 4 (these outliers can also be seen in Figure 5-f). Lastly, the spread of the shift functions around the zero-crossing does not indicate vendor-specific clustering for the selected measurements and reference values.

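The shift-function analysis in Figure 7 compares a voxel-wise distribution to a reference at matched quantiles. A minimal sketch of the idea using plain deciles via `np.quantile` (the actual hierarchical analysis uses the Harrell-Davis quantile estimator with bootstrapped differences; the voxel values below are simulated):

```python
import numpy as np

rng = np.random.default_rng(0)

reference_t1 = 1000.0  # ms, nominal reference value (as for sphere 3)
# Simulated voxel-wise T1 measurements with a global ~40 ms underestimation
voxels = rng.normal(loc=reference_t1 - 40.0, scale=15.0, size=500)

# Nine quantiles (q1-q9), as in the hierarchical analysis of Figure 7
deciles = np.quantile(voxels, np.linspace(0.1, 0.9, 9))

# Shift function: measured deciles minus the reference value;
# all-negative values indicate underestimation across the whole distribution
shift = deciles - reference_t1
```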
3.3     |     Human

Figure 8 summarizes the results from the human datasets submitted to this challenge, showing mean and standard deviation T1 values from the WM (genu) and GM (cerebral cortex) ROIs. The top plot collapses all datasets for each site, while the bottom plot shows each dataset separately. Mean WM T1 values across all submissions were 828 ± 38 ms in the genu and 854 ± 50 ms in the splenium, and mean GM T1 values were 1548 ± 156 ms in the cortex and 1188 ± 133 ms in the deep GM, with less variation overall in WM than in GM, possibly due to better ROI placement and less partial voluming in WM. Inter-submission coefficients of variation (CoV) for independently implemented imaging protocols were calculated using the one T1 map measurement per submission that most closely matched the proposed protocol, and were 6.0 % for the genu, 11 % for the splenium, 16 % for cortical GM, and 22 % for deep GM. One site (site 9) measured multiple subjects on three scanners from two different vendors, so intra-submission CoVs for these centrally implemented protocols were calculated over the T1 maps acquired at this site, and were 2.9 % for the genu, 3.5 % for the splenium, 6.9 % for cortical GM, and 7.8 % for deep GM. This particular acquisition had ideal slice positioning, cutting through the AC-PC line and the genu for proper ROI placement, particularly for the corpus callosum and deep GM (Supplementary Figure 1 - top left).

from os import path
import os

if build == 'latest':
    if not path.isdir('analysis'):
        !git clone https://github.com/rrsg2020/analysis.git
        dir_name = 'analysis'
        analysis = os.listdir(dir_name)

        for item in analysis:
            if item.endswith(".ipynb"):
                os.remove(os.path.join(dir_name, item))
            if item.endswith(".md"):
                os.remove(os.path.join(dir_name, item))
elif build == 'archive':
    if os.path.isdir(Path('../../data')):
        data_path = ['../../data/rrsg-2020-neurolibre']
    else:
        # define data requirement path
        data_req_path = os.path.join("..", "binder", "data_requirement.json")
        # download data
        repo2data = Repo2Data(data_req_path)
        data_path = repo2data.install()         

# Imports
import warnings
warnings.filterwarnings("ignore")

from pathlib import Path
import pandas as pd
import nibabel as nib
import numpy as np

from analysis.src.database import *
import matplotlib.pyplot as plt
plt.style.use('analysis/custom_matplotlibrc')
plt.rcParams["figure.figsize"] = (20,5)
fig_id = 0

# Configurations
if build == 'latest':
    database_path = Path('analysis/databases/3T_human_T1maps_database.pkl')
    output_folder = Path("analysis/plots/08_wholedataset_scatter_Human/")
elif build=='archive':
    database_path = Path(data_path[0] + '/analysis/databases/3T_human_T1maps_database.pkl')
    output_folder = Path(data_path[0] + '/analysis/plots/08_wholedataset_scatter_Human/')

estimate_type = 'mean' # median or mean

# Define functions

def plot_both_scatter(x1, x2, y, y_std,
                      title, x1_label, x2_label, y_label,
                      file_prefix, folder_path, fig_id):
    
    plt.rcParams["figure.figsize"] = (20,10)

    fig, axs = plt.subplots(2)
    fig.suptitle(title)
    axs[0].errorbar(x1, y, y_std, fmt='o', solid_capstyle='projecting')
    axs[0].set_xlabel(x1_label)
    axs[0].set_ylabel(y_label)
    axs[0].set_xticks(np.arange(0, np.max(x1), step=1))


    axs[1].errorbar(x2, y, y_std, fmt='o', solid_capstyle='projecting')
    axs[1].set_xlabel(x2_label)
    axs[1].set_ylabel(y_label)
    axs[1].set_xticklabels(labels=x2, rotation=90)


    if fig_id<10:
        filename = "0" + str(fig_id) + "_" + file_prefix
    else:
        filename = str(fig_id) + "_" + file_prefix

    fig.savefig(folder_path / (str(filename) + '.svg'), facecolor='white')
    fig.savefig(folder_path / (str(filename) + '.png'), facecolor='white')
    fig_id = fig_id + 1
    plt.show()
    return fig_id

# Load database

df = pd.read_pickle(database_path)

genu_estimate = np.array([])
genu_std = np.array([])
splenium_estimate = np.array([])
splenium_std = np.array([])
deepgm_estimate = np.array([])
deepgm_std = np.array([])
cgm_estimate = np.array([])
cgm_std = np.array([])

ii = 0
for index, row in df.iterrows():
    
    if estimate_type == 'mean':
        genu_estimate = np.append(genu_estimate, np.mean(df.loc[index]['T1 - genu (WM)']))
        genu_std = np.append(genu_std, np.std(df.loc[index]['T1 - genu (WM)']))
        splenium_estimate = np.append(splenium_estimate, np.mean(df.loc[index]['T1 - splenium (WM)']))
        splenium_std = np.append(splenium_std, np.std(df.loc[index]['T1 - splenium (WM)']))
        deepgm_estimate = np.append(deepgm_estimate, np.mean(df.loc[index]['T1 - deep GM']))
        deepgm_std = np.append(deepgm_std, np.std(df.loc[index]['T1 - deep GM']))
        cgm_estimate = np.append(cgm_estimate, np.mean(df.loc[index]['T1 - cortical GM']))
        cgm_std = np.append(cgm_std, np.std(df.loc[index]['T1 - cortical GM']))
    elif estimate_type == 'median':
        genu_estimate = np.append(genu_estimate, np.median(df.loc[index]['T1 - genu (WM)']))
        genu_std = np.append(genu_std, np.std(df.loc[index]['T1 - genu (WM)']))
        splenium_estimate = np.append(splenium_estimate, np.median(df.loc[index]['T1 - splenium (WM)']))
        splenium_std = np.append(splenium_std, np.std(df.loc[index]['T1 - splenium (WM)']))
        deepgm_estimate = np.append(deepgm_estimate, np.median(df.loc[index]['T1 - deep GM']))
        deepgm_std = np.append(deepgm_std, np.std(df.loc[index]['T1 - deep GM']))
        cgm_estimate = np.append(cgm_estimate, np.median(df.loc[index]['T1 - cortical GM']))
        cgm_std = np.append(cgm_std, np.std(df.loc[index]['T1 - cortical GM']))
    else:
        raise ValueError('Unsupported dataset estimate type.')
    ii = ii +1

# Store the IDs
indexes_numbers = df.index
indexes_strings = indexes_numbers.map(str)

x1_label = 'Site #'
x2_label = 'Site #.Meas #'
y_label = "T$_1$ (ms)"
file_prefix = 'WM_and_GM'
folder_path = output_folder

x1 = indexes_numbers
x2 = indexes_strings
y = genu_estimate
y_std = genu_std

# Paper formatting of x tick labels: remove the leading zero after the decimal
# point, and pad a trailing zero for multiples of 10 so the labels stay aligned
x3 = []
for num in x2:
    x3.append(num.replace('.0', '.'))

# Pad a trailing zero when the measurement part is two digits wide
for index, num in enumerate(x3):
    if num[-3] != '.':
        x3[index] = num + '0'

Figure 8 Mean T1 values in two sets of ROIs: white matter (one 5⨯5 voxel ROI, genu) and gray matter (three 3⨯3 voxel ROIs, cortex). The top panel shows all datasets collapsed into sites, whereas the bottom panel shows each individual dataset.

4     |     DISCUSSION

4.1     |     Achievements of the challenge

The challenge focused on exploring the reproducibility of the gold-standard inversion recovery T1 mapping method reported in a seminal paper [Barral et al., 2010]. Eighteen submissions independently implemented the inversion recovery T1 mapping acquisition protocol outlined in Barral et al. [2010] (which is optimized for the T1 values observed in brain tissue), and reported T1 mapping data in a standard quantitative MRI phantom and/or human brains at 27 MRI sites, using systems from three different vendors (GE, Philips, Siemens). The collaborative effort produced an open-source database of 94 T1 mapping datasets, including 38 ISMRM/NIST phantom and 56 human brain datasets. A standardized T1 processing pipeline was developed to handle the different dataset types, including magnitude-only and complex data. Additionally, Jupyter notebooks that can be executed in containerized environments were developed for quality assurance, visualization, and analysis. An interactive web-based dashboard was also developed to allow easy exploration of the challenge results in a web browser.
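As a rough illustration of the per-voxel fitting at the core of such a pipeline, the sketch below fits the inversion recovery model of Barral et al. [2010], S(TI) = a + b·exp(−TI/T1), to a simulated noiseless voxel. Because the model is linear in (a, b) for a fixed T1, a grid search over T1 with a linear least-squares solve per candidate suffices; the function name, grid, and TI values are illustrative, not the challenge pipeline's API.

```python
import numpy as np

# Minimal sketch of per-voxel IR fitting: for each candidate T1, the model
# S(TI) = a + b*exp(-TI/T1) is linear in (a, b), so solve a small linear
# least-squares problem and keep the T1 with the lowest residual.
def fit_t1_voxel(ti_ms, signal, t1_grid=np.arange(100.0, 3001.0, 1.0)):
    best_t1, best_res = None, np.inf
    for t1 in t1_grid:
        basis = np.column_stack([np.ones_like(ti_ms), np.exp(-ti_ms / t1)])
        coeffs, _, _, _ = np.linalg.lstsq(basis, signal, rcond=None)
        res = np.sum((basis @ coeffs - signal) ** 2)
        if res < best_res:
            best_t1, best_res = t1, res
    return best_t1

# Simulate a noiseless voxel with T1 = 850 ms at four inversion times
ti = np.array([50.0, 400.0, 1100.0, 2500.0])
signal = 1.0 - 2.0 * np.exp(-ti / 850.0)
t1_fit = fit_t1_voxel(ti, signal)
print(t1_fit)  # → 850.0
```

In practice, magnitude-only data additionally requires polarity restoration of the signal before (or during) fitting, which is one reason the pipeline distinguishes magnitude-only from complex datasets.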

To evaluate the accuracy of the resulting T1 values, the challenge used the standard ISMRM/NIST phantom, whose fiducial spheres span T1 values in the range of human brain tissue, from 500 to 2000 ms (see Figure 5). As anticipated for this protocol, measurement accuracy decreased for spheres with T1 below 300 ms. Overall, the majority of the independently implemented imaging protocols from the various sites were consistent with the temperature-corrected reference values, with only a few exceptions. Using the NIST phantom, sites that independently implemented the imaging protocol yielded an inter-submission mean CoV (6.1 %) roughly twice as high as the intra-submission mean CoV measured at seven sites (2.9 %). A similar trend was observed in vivo: inter-submission CoVs for WM (genu) and GM (cortex) were 6.0 % and 16.5 %, versus intra-submission CoVs of 2.9 % and 6.9 %, respectively. The generally higher in vivo CoVs relative to the phantom measurements are likely due to biological variability [Piechnik et al., 2013, Stanisz et al., 2005].
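The inter- vs intra-submission comparison boils down to a coefficient of variation (CoV = standard deviation / mean) computed over per-dataset T1 estimates; a minimal sketch, with illustrative values rather than the challenge data:

```python
import numpy as np

def cov_percent(t1_values_ms):
    """Coefficient of variation (std/mean) of T1 estimates, in percent."""
    t1 = np.asarray(t1_values_ms, dtype=float)
    return 100.0 * t1.std() / t1.mean()

# Hypothetical mean T1 (ms) for one ROI, across datasets:
inter_submission = [830, 905, 860, 790, 880]  # independent implementations
intra_submission = [848, 852, 845, 855, 850]  # one shared protocol, many sites

print(f"inter: {cov_percent(inter_submission):.1f} %")
print(f"intra: {cov_percent(intra_submission):.1f} %")
```

The shared-protocol values cluster more tightly, so their CoV is lower, mirroring the inter- vs intra-submission gap reported above.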

4.2     |     Comparison with other studies

The work done during this challenge involved a multi-center quantitative T1 mapping study using the NIST phantom across various sites, and it overlaps with two recent studies [Bane et al., 2018, Keenan et al., 2021]. Bane et al. [2018] focused on the reproducibility of two standard quantitative T1 techniques (inversion recovery and variable flip angle, VFA) and a wide variety of site-specific T1 mapping protocols for DCE, mostly VFA protocols with fewer flip angles, implemented at eight imaging centers covering the same three MRI vendors featured in this challenge (GE/Philips/Siemens). The inter-platform coefficient of variation for the standard inversion recovery T1 protocol was 5.46 % at 3 T in [Bane et al., 2018], substantially lower than what they observed for their standard VFA protocol (22.87 %). However, Bane et al.'s work differed from the challenge in several ways. First, the standard inversion recovery imaging protocol used by Bane et al. [2018] had more inversion times (14, compared to the challenge's 4) in order to cover the entire range of T1 values of the phantom. Secondly, Bane et al. [2018] used a single traveling phantom for all sites, whereas the challenge used a total of 8 different phantoms (some shared amongst people who participated independently). Thirdly, Bane et al. [2018] averaged the signals within each sphere's ROI prior to fitting for T1, whereas the challenge pipeline fits T1 on a per-voxel basis and only subsequently calculates the mean/median/standard deviation. They also acquired only magnitude data, in contrast to the challenge, where participants were encouraged to submit both complex and magnitude-only data. Lastly, in Bane et al. [2018], the implementations of the common inversion recovery protocols were fully standardized across all platforms (except for two cases where one manufacturer could not achieve the lowest TI), imposed and coordinated by the principal researchers.
In contrast, the challenge sought to explore the variations that occur for a less-restricted protocol (Table 2) that is independently implemented at multiple centers, which more closely emulates the quantitative MR research workflow (publish a technique and protocol → independently implement the pulse sequence and/or protocol → use the new implementation independently in a study → publish). Of note, one participating group in the challenge coordinated a large multicenter dataset that mirrors the study by Bane et al. [2018] by imaging a single phantom across 7 different imaging sites, albeit on systems from a single manufacturer. Using this subset, the mean cross-site CoV was 2.9 % (range: 1.6 - 4.9 %) for the first five spheres, in agreement with the range observed across all spheres by Bane et al. [2018] at 3 T using their full inversion recovery protocol (CoV = 5.46 %; range: 0.99 - 14.6 %).

Another study, by Keenan et al. [2021], also investigated the accuracy of T1 mapping techniques using a single ISMRM/NIST system phantom at multiple sites and on multiple platforms. Like Bane et al. [2018], they used an inversion recovery imaging protocol optimized for the full range of T1 values represented in the ISMRM/NIST phantom, consisting of 9 to 10 inversion times and a TR of 4500 ms (TR ≈ 5×T1 of WM at 3 T). They reported no consistent pattern of differences in measured T1 values across MRI vendors for the two T1 mapping techniques they used (inversion recovery and VFA). Relative errors between their T1 measurements and the reference values of the phantom were below 10 % for all T1 values, with the largest errors observed at the lowest and highest T1 values of the phantom.
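The TR ≈ 5×T1 rule of thumb ensures near-complete longitudinal recovery between repetitions. A rough check (assuming recovery from zero magnetization and an illustrative WM T1 value):

```python
import numpy as np

# Longitudinal recovery fraction after TR, starting from Mz = 0:
# Mz/M0 = 1 - exp(-TR/T1)
t1_wm = 900.0   # ms, approximate WM T1 at 3 T (illustrative)
tr = 4500.0     # ms, the TR used by Keenan et al. [2021]
recovery = 1.0 - np.exp(-tr / t1_wm)
print(f"{100 * recovery:.1f} % recovered")  # → 99.3 % recovered
```

With TR = 5×T1, the residual saturation is under 1 %, so the measured T1 is largely insensitive to incomplete recovery for tissues at or below that T1.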

4.3     |     Lessons learned and future directions

Some limitations of this challenge should be noted. First, submissions were due in March 2020, when COVID-19 pandemic lockdowns restricted scanner access and limited repeated experiments. Nevertheless, a substantial number of participants submitted their datasets. Some groups had intended to acquire more data or to re-scan volunteers, but could no longer do so under local pandemic restrictions.

This reproducibility challenge aimed to compare differences between independently implemented protocols. Crowning a winner was not an aim of the challenge, due to concerns that participants would have tuned their protocols to get closer to the reference T1 values, broadening the differences in protocol implementations across MRI sites. Instead, we focused on building consensus by creating an open data repository, sharing reproducible workflows, and presenting the results through interactive visualizations. Future work warrants the study of inter-site differences in a vendor-neutral workflow [Karakuzu et al., 2022], adhering to the latest Brain Imaging Data Structure (BIDS) community data standard on qMRI [Karakuzu et al., 2022].

5     |     CONCLUSION

The 2020 Reproducibility Challenge, jointly organized by the Reproducible Research and Quantitative MR ISMRM study groups, led to the creation of a large open database of standard quantitative MR phantom and human brain inversion recovery T1 maps. These maps were measured using independently implemented imaging protocols on MRI scanners from three different manufacturers. All collected data, processing pipeline code, computational environment files, and analysis scripts were shared with the goal of promoting reproducible research practices, and an interactive dashboard was developed to broaden the accessibility of the resulting datasets and encourage engagement with them (https://rrsg2020.dashboards.neurolibre.org). The differences in stability between independently implemented (inter-submission) and centrally shared (intra-submission) protocols, observed both in phantoms and in vivo, could help inform future meta-analyses of quantitative MRI metrics [Lazari and Lipp, 2021, Mancini et al., 2020] and better guide multi-center collaborations.

ACKNOWLEDGEMENT

The conception of this collaborative reproducibility challenge originated from discussions with experts, including Paul Tofts, Joëlle Barral, and Ilana Leppert, who provided valuable insights. Additionally, Kathryn Keenan, Zydrunas Gimbutas, and Andrew Dienstfrey from NIST provided their code to generate the ROI template for the ISMRM/NIST phantom. Dylan Roskams-Edris and Gabriel Pelletier from the Tanenbaum Open Science Institute (TOSI) offered valuable insights and guidance related to data ethics and data sharing in the context of this international multi-center conference challenge. The 2020 RRSG study group committee members who launched the challenge, Martin Uecker, Florian Knoll, Nikola Stikov, Maria Eugenia Caligiuri, and Daniel Gallichan, as well as the 2020 qMRSG committee members, Kathryn Keenan, Diego Hernando, Xavier Golay, Annie Yuxin Zhang, and Jeff Gunter, also played an essential role in making this challenge possible. Finally, we extend our thanks to all the volunteers and individuals who helped with the scanning at each imaging site.

DATA AVAILABILITY STATEMENT

An interactive preprint of this manuscript is available at http://rrsg2020.github.io/paper. All imaging data submitted to the challenge, dataset details, registered ROI maps, and processed T1 maps are hosted on OSF: https://osf.io/ywc9g/. The dataset submissions and quality assurance were handled through GitHub issues in this repository: https://github.com/rrsg2020/data_submission (commit: 9d7eff1). Note that accepted submissions are closed issues, and that the GitHub branches associated with the issue numbers contain the Dockerfile and Jupyter Notebook scripts that reproduce these preliminary quality assurance results and can be run in a browser using MyBinder. The ROI registration scripts for the phantoms and the T1 fitting pipeline used to process all datasets are hosted in this GitHub repository: https://github.com/rrsg2020/t1_fitting_pipeline (commit: 3497a4e). All the analyses of the datasets were done using Jupyter Notebooks and are available in this repository: https://github.com/rrsg2020/analysis (commit: 8d38644), which also contains a Dockerfile to reproduce the environment using a tool like MyBinder. A dashboard was developed to explore the dataset information and results in a browser, accessible at https://rrsg2020.dashboards.neurolibre.org, and its code is also available on GitHub: https://github.com/rrsg2020/rrsg2020-dashboard (commit: 6ee9321).

References

1

Octavia Bane, Stefanie J Hectors, Mathilde Wagner, Lori L Arlinghaus, Madhava P Aryal, Yue Cao, Thomas L Chenevert, Fiona Fennessy, Wei Huang, Nola M Hylton, Jayashree Kalpathy-Cramer, Kathryn E Keenan, Dariya I Malyarenko, Robert V Mulkern, David C Newitt, Stephen E Russek, Karl F Stupic, Alina Tudorica, Lisa J Wilmes, Thomas E Yankeelov, Yi-Fei Yen, Michael A Boss, and Bachir Taouli. Accuracy, repeatability, and interplatform reproducibility of T1 quantification methods used for DCE-MRI: results from a multicenter phantom study. Magn. Reson. Med., 79(5):2564–2575, May 2018. doi:10.1002/mrm.26903.

2

Joëlle K Barral, Erik Gudmundson, Nikola Stikov, Maryam Etezadi-Amoli, Petre Stoica, and Dwight G Nishimura. A robust methodology for in vivo T1 mapping. Magn. Reson. Med., 64(4):1057–1067, October 2010. doi:10.1002/mrm.22497.

3

Agah Karakuzu, Stefan Appelhoff, Tibor Auer, Mathieu Boudreau, Franklin Feingold, Ali R Khan, Alberto Lazari, Chris Markiewicz, Martijn Mulder, Christophe Phillips, and others. qMRI-BIDS: an extension to the Brain Imaging Data Structure for quantitative magnetic resonance imaging data. Scientific Data, 9(1):517, 2022. doi:10.1038/s41597-022-01571-4.

4

Agah Karakuzu, Labonny Biswas, Julien Cohen-Adad, and Nikola Stikov. Vendor-neutral sequences and fully transparent workflows improve inter-vendor reproducibility of quantitative MRI. Magnetic Resonance in Medicine, 88(3):1212–1228, 2022. doi:10.1002/mrm.29292.

5

Kathryn E Keenan, Zydrunas Gimbutas, Andrew Dienstfrey, Karl F Stupic, Michael A Boss, Stephen E Russek, Thomas L Chenevert, P V Prasad, Junyu Guo, Wilburn E Reddick, Kim M Cecil, Amita Shukla-Dave, David Aramburu Nunez, Amaresh Shridhar Konar, Michael Z Liu, Sachin R Jambawalikar, Lawrence H Schwartz, Jie Zheng, Peng Hu, and Edward F Jackson. Multi-site, multi-platform comparison of MRI T1 measurement using the system phantom. PLoS One, 16(6):e0252966, June 2021. doi:10.1371/journal.pone.0252966.

6

Alberto Lazari and Ilona Lipp. Can MRI measure myelin? systematic review, qualitative assessment, and meta-analysis of studies validating microstructural imaging with myelin histology. Neuroimage, 230:117744, April 2021. doi:10.1016/j.neuroimage.2021.117744.

7

Matteo Mancini, Agah Karakuzu, Julien Cohen-Adad, Mara Cercignani, Thomas E Nichols, and Nikola Stikov. An interactive meta-analysis of MRI biomarkers of myelin. eLife, October 2020. doi:10.55458/neurolibre.00004.

8

Stefan K Piechnik, Vanessa M Ferreira, Adam J Lewandowski, Ntobeko A B Ntusi, Rajarshi Banerjee, Cameron Holloway, Mark B M Hofman, Daniel M Sado, Viviana Maestrini, Steven K White, Merzaka Lazdam, Theodoros Karamitsos, James C Moon, Stefan Neubauer, Paul Leeson, and Matthew D Robson. Normal variation of magnetic resonance T1 relaxation times in the human population at 1.5 T using ShMOLLI. J. Cardiovasc. Magn. Reson., 15(1):13, January 2013. doi:10.1186/1532-429X-15-13.

9

Greg J Stanisz, Ewa E Odrobina, Joseph Pun, Michael Escaravage, Simon J Graham, Michael J Bronskill, and R Mark Henkelman. T1, T2 relaxation and magnetization transfer in tissue at 3T. Magn. Reson. Med., 54(3):507–512, September 2005. doi:10.1002/mrm.20605.


1

Strictly speaking, not all manufacturers' systems operate at 3.0 T. Even though this is the field strength advertised by the system manufacturers, there is some deviation in actual field strength between vendors. The actual center frequencies are typically reported in the DICOM files; these were shared for most datasets and are available in our OSF.io repository (https://osf.io/ywc9g/). From these datasets, the center frequencies imply that participants who used GE and Philips scanners were at 3.0 T (~127.7 MHz), whereas participants who used Siemens scanners were at 2.89 T (~123.2 MHz). For simplicity, we always refer to the field strength in this article as 3 T.
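The quoted field strengths follow from the Larmor relation, B0 = f0 / γ̄; a quick check, assuming the 1H gyromagnetic ratio γ̄ ≈ 42.577 MHz/T:

```python
# Convert the reported center frequencies to field strength via B0 = f0 / gamma_bar
gamma_bar = 42.577  # MHz/T for 1H (approximate)

for f0_mhz in (127.7, 123.2):  # MHz, from the shared DICOM headers
    print(f"{f0_mhz} MHz -> {f0_mhz / gamma_bar:.2f} T")  # → 3.00 T and 2.89 T
```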

2

Only T1 maps measured using phantom version 1 were included in this inter-submission CoV, as including both sets would have increased the CoV due to the differences in reference T1 values between phantom versions. Seven participants used version 1, and six used version 2.

3

Due to the noise floor or artifacts.

4

See online dashboard https://rrsg2020.dashboards.neurolibre.org/apps/stats and click on the “HSF” tab to view these graphs for all 14 spheres.